From Detection/Correction to Computer Aided Writing

Authors

  • Damien Genthial
  • Jacques Courtin
Abstract

Most texts nowadays are produced in electronic form using systems which provide text processing facilities but also linguistic facilities such as spelling checkers, on-line lexicons and even syntactic checkers. We think that a computer-aided writing system must be designed as a complete environment for the production, maintenance, edition and communication of texts. This implies, for example, the use of an ideas manager and on-line lexicons for production, a text editor and linguistic verifiers for maintenance, a text processor for edition and a standardized form for communication. Following our work on detection and correction of errors, we propose an architecture for a system able to integrate in a uniform way our linguistic tools (morphological parsing and generation, lexical correction techniques, syntactic parser and verifier) as well as tools for text processing and document editing and exporting. Tools are designed as specialized modules arranged around a single data structure, which is the internal representation of the text. This structure is a multi-dimensional lattice, coding the linearity but also the structure and the ambiguities of the text. It is complemented by a lexicon based on typed feature structures encoding morphological, syntactic and semantic information on words. Distributing the competence of the system among specialized modules makes the system itself easier to maintain and, moreover, allows competence sharing among the modules, which is very important for the linguistic ones (for example, the syntactic verifier needs to use almost every linguistic module: morphology, phonetics, syntax).

1. Introduction

In their life-cycle from creation to publishing, all texts nowadays take an electronic form. Most of them are directly produced in this form and take the paper form only for publishing. Thus a lot of services can be provided to the writer who uses a computer to produce his texts.
This idea is not new but, following our work on detection and correction of errors, we think it must be investigated more deeply than it has been. We first introduce what we mean by computer aided writing. We then propose an architecture for a computer aided writing environment and quickly describe its modules. We outline one of its main characteristics (limited data structures), and finally justify the second one (distribution of services) in the light of our work on detection and correction of errors.

2. Computer Aided Writing (CAW)

A computer system for a writer is basically a personal computer which runs a text processor. The increase in power of personal computers has been followed by a growth in the services provided to the user. Some of these services aim to increase the writer's productivity, but most of them aim at obtaining a better quality of produced documents. We will distinguish here between two categories of services: presentation services and production services.

The first ones concern the way the paper form of the text looks: justification, formatting, multi-column layout... They are very powerful in modern systems, especially if you add to your text processor a graphic processor and a page maker, but they have little to do with linguistics and so we will not discuss them here.

The second ones concern the text itself, in its content and in its form. The best known and most accomplished service in this category is the spelling checker, which can be found in every modern text processor. Recently, other services have emerged:

  • on-line lexicons with synonym and antonym links;
  • idea managers which help the user to build the plan of his document;
  • syntactic checkers in the spirit of the IBM system CRITIQUE [6].

In most cases, these new services are add-ons to an existing text processor, and CAW systems are stacks of tools, lacking the coherence of an integrated approach.
Our idea is that CAW must be thought of as a goal in itself, and our aim is to build an environment for the production, maintenance, edition and communication of texts. Such a system will be based on a coherent set of software tools reflecting the state of the art in string manipulation and linguistic treatment. At first glance, the system should include classic and well-known tools such as those cited above and more sophisticated tools like:

  • morphological analysis and generation, which can for example be used for lemmatization of words or groups of words. The idea here is to use these lemmatized groups as keys to access external knowledge bases or document bases [9].
  • syntactico-semantic analysis and generation to allow operations like: changing the tense of a paragraph, changing the modality of a sentence, help in detecting ambiguous phrases and in disambiguation by proposing paraphrases. There is also the possibility of generating a definition of a word on the basis of its formal description in the lexicon.
  • lexical and syntactic checkers, which must also be able to propose corrections, by the use of all the linguistic knowledge included in the system.
  • structural manipulations of the text in the spirit of idea managers but also some verifications on the structure by the use of a grammar of the text, which depends on the type of document created. For example, a software documentation will include a user manual and a reference manual, the user manual will include an installation chapter, a tutorial introduction chapter, ...
  • interface with the outside world: that includes of course the production of a paper form of the text but also, at least as important as the former, the production of the text in some standardized form (for example the form [...]

[Actes de COLING-92, Nantes, 23-28 août 1992; Proc. of COLING-92, Nantes, Aug. 23-28, 1992; p. 1014]

[...] characteristics are the use of a minimal number of data structures and a distributed architecture.
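The two characteristics named above (few data structures, services distributed over modules) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `Reading`, `Slot`, `TextLattice`, `morphology` and `lexical_checker`, the toy lexicon and the example sentence are all assumptions made for illustration, and the lattice is reduced to a linear sequence of slots that each hold alternative readings, which is far simpler than the multi-dimensional lattice mentioned in the abstract.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Reading:
    """One candidate analysis of a word (hypothetical, flat feature set)."""
    lemma: str
    features: Dict[str, str]


@dataclass
class Slot:
    """One position in the text; several readings encode an ambiguity."""
    surface: str
    readings: List[Reading] = field(default_factory=list)


@dataclass
class TextLattice:
    """Shared representation of the text, used by every module."""
    slots: List[Slot]

    def ambiguous(self) -> List[str]:
        """Surface forms that currently carry more than one reading."""
        return [s.surface for s in self.slots if len(s.readings) > 1]


# Toy lexicon (illustrative entries only): surface form -> possible readings.
LEXICON: Dict[str, List[Reading]] = {
    "la": [Reading("le", {"cat": "det"}), Reading("le", {"cat": "pron"})],
    "porte": [Reading("porte", {"cat": "noun"}), Reading("porter", {"cat": "verb"})],
    "est": [Reading("être", {"cat": "verb"}), Reading("est", {"cat": "noun"})],
    "ouverte": [Reading("ouvert", {"cat": "adj"})],
}


def tokenize(text: str) -> TextLattice:
    """Build the shared structure from raw text (whitespace split only)."""
    return TextLattice([Slot(word) for word in text.split()])


def morphology(lattice: TextLattice) -> None:
    """Module 1: attach every lexicon reading to each slot, keeping ambiguities."""
    for slot in lattice.slots:
        slot.readings = list(LEXICON.get(slot.surface.lower(), []))


def lexical_checker(lattice: TextLattice) -> List[str]:
    """Module 2: report words left without a reading by the morphology module."""
    return [s.surface for s in lattice.slots if not s.readings]


lattice = tokenize("la porte est ouvrte")  # "ouvrte" is a deliberate misspelling
morphology(lattice)
unknown = lexical_checker(lattice)

print(lattice.ambiguous())  # ['la', 'porte', 'est']
print(unknown)              # ['ouvrte']
```

In this sketch the lexical checker contains no morphological knowledge of its own: it only reads what the morphology module wrote into the shared structure, which is one simple way to picture the competence sharing among modules that the abstract describes.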
We will here quickly describe the role of each module, leaving for the next two sections the discussion about data structures and architectural choices. The proposed system is primarily built for French but every module has been designed to be as general as possible, and is completely configurable, so that it can be used for other




Publication date: 1992